Safety Module


Towards Safe Robot Foundation Models

Tölle, Maximilian, Gruner, Theo, Palenicek, Daniel, Günster, Jonas, Liu, Puze, Watson, Joe, Tateo, Davide, Peters, Jan

arXiv.org Artificial Intelligence

Robot foundation models hold the potential for deployment across diverse environments, from industrial applications to household tasks. While current research focuses primarily on the policies' generalization capabilities across a variety of tasks, it fails to address safety, a critical requirement for deployment on real-world systems. In this paper, we introduce a safety layer designed to constrain the action space of any generalist policy appropriately. Our approach uses ATACOM, a safe reinforcement learning algorithm that creates a safe action space and, therefore, ensures safe state transitions. By extending ATACOM to generalist policies, our method facilitates their deployment in safety-critical scenarios without requiring any specific safety fine-tuning. We demonstrate the effectiveness of this safety layer in an air hockey environment, where it prevents a puck-hitting agent from colliding with its surroundings, a failure observed in generalist policies.
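
As a rough illustration of what such a safety layer does (not the ATACOM formulation itself), the following Python sketch filters a generalist policy's action by removing the component that would push the state across a constraint boundary. The constraint function, margin, and projection rule are assumptions made for this example.

```python
import numpy as np

def safe_action(q, a, constraint, jacobian, margin=0.05):
    """Project action a so it does not push the state into the constraint.

    q          : current state (e.g. joint positions)
    a          : raw action from the generalist policy (interpreted as a velocity)
    constraint : c(q) <= 0 means safe; values near 0 mean near the boundary
    jacobian   : dc/dq, the gradient of the constraint w.r.t. the state
    """
    c = constraint(q)
    if c < -margin:
        return a                          # well inside the safe set: pass through
    J = jacobian(q)
    n = J / (np.linalg.norm(J) + 1e-8)    # outward constraint direction
    a_normal = np.dot(a, n)               # component pushing toward the boundary
    if a_normal > 0.0:
        a = a - a_normal * n              # remove the violating component
    return a

# Toy example: keep a 2-D end-effector above the table plane y >= 0.1.
constraint = lambda q: 0.1 - q[1]             # <= 0 when y >= 0.1
jacobian   = lambda q: np.array([0.0, -1.0])  # d(0.1 - y)/dq
print(safe_action(np.array([0.3, 0.11]), np.array([0.2, -0.5]), constraint, jacobian))
```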


Reactive and Safety-Aware Path Replanning for Collaborative Applications

Tonola, Cesare, Faroni, Marco, Abdolshah, Saeed, Hamad, Mazin, Haddadin, Sami, Pedrocchi, Nicola, Beschi, Manuel

arXiv.org Artificial Intelligence

This paper addresses motion replanning in human-robot collaborative scenarios, emphasizing reactivity and safety-compliant efficiency. While existing human-aware motion planners are effective in structured environments, they often struggle with unpredictable human behavior, leading to safety measures that limit robot performance and throughput. In this study, we combine reactive path replanning with a safety-aware cost function, allowing the robot to adapt its path in response to changes in the human's state. This solution reduces execution time and the need for trajectory slowdowns without sacrificing safety. Simulations and real-world experiments show the method's effectiveness compared to standard human-robot cooperation approaches, with efficiency improvements of up to 60%.
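
A minimal sketch of what a safety-aware replanning cost might look like, assuming a simple proximity penalty around the human; the weights and penalty form below are illustrative assumptions, not the cost function used in the paper.

```python
import numpy as np

def path_cost(path, human_pos, w_safety=2.0, d_safe=0.5):
    """Illustrative cost: nominal path length plus a proximity penalty near the human.

    path      : (N, 3) array of waypoints
    human_pos : (3,) current human position estimate
    """
    segs = np.diff(path, axis=0)
    length = np.linalg.norm(segs, axis=1).sum()
    dists = np.linalg.norm(path - human_pos, axis=1)
    # Penalize waypoints that would force a safety-related slowdown.
    penalty = np.clip(d_safe - dists, 0.0, None).sum()
    return length + w_safety * penalty

# Compare two candidate paths during replanning and keep the cheaper one.
p1 = np.array([[0, 0, 0], [0.5, 0.1, 0], [1, 0, 0]], dtype=float)
p2 = np.array([[0, 0, 0], [0.5, 0.6, 0], [1, 0, 0]], dtype=float)
human = np.array([0.5, 0.2, 0.0])
best = min((p1, p2), key=lambda p: path_cost(p, human))
```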


PSA-VLM: Enhancing Vision-Language Model Safety through Progressive Concept-Bottleneck-Driven Alignment

Liu, Zhendong, Nie, Yuanbi, Tan, Yingshui, Liu, Jiaheng, Yue, Xiangyu, Cui, Qiushi, Wang, Chongjun, Zhu, Xiaoyong, Zheng, Bo

arXiv.org Artificial Intelligence

Benefiting from the powerful capabilities of Large Language Models (LLMs), pre-trained visual encoder models connected to LLMs form Vision Language Models (VLMs). However, recent research shows that the visual modality in VLMs is highly vulnerable, allowing attackers to bypass the safety alignment of the LLM through visually transmitted content and launch harmful attacks. To address this challenge, we propose a progressive concept-based alignment strategy, PSA-VLM, which incorporates safety modules as concept bottlenecks to enhance visual-modality safety alignment. By aligning model predictions with specific safety concepts, we improve defenses against risky images, enhancing explainability and controllability while minimally impacting general performance. Our method is trained in two stages: the first stage brings substantial performance improvements at low computational cost, and fine-tuning the language model in the second stage further improves safety. Our method achieves state-of-the-art results on popular VLM safety benchmarks.
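
The concept-bottleneck idea can be sketched as a small module that maps visual features to a handful of named safety concepts and predicts risk only from those concepts. The concept list, dimensions, and gating rule below are assumptions for illustration, not the PSA-VLM architecture.

```python
import torch
import torch.nn as nn

class SafetyConceptBottleneck(nn.Module):
    """Illustrative concept bottleneck: visual features -> concept scores -> risk."""
    CONCEPTS = ["violence", "self_harm", "illegal_activity", "privacy", "benign"]

    def __init__(self, vis_dim=1024):
        super().__init__()
        self.to_concepts = nn.Linear(vis_dim, len(self.CONCEPTS))  # interpretable bottleneck
        self.risk_head = nn.Linear(len(self.CONCEPTS), 1)          # risk predicted from concepts only

    def forward(self, vis_feat):
        concept_logits = self.to_concepts(vis_feat)
        risk = torch.sigmoid(self.risk_head(concept_logits))
        return concept_logits, risk

module = SafetyConceptBottleneck()
concepts, risk = module(torch.randn(1, 1024))
if risk.item() > 0.5:
    print("route to safe-response template")   # controllability: intervene on the bottleneck
```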


SaLoRA: Safety-Alignment Preserved Low-Rank Adaptation

Li, Mingjie, Si, Wai Man, Backes, Michael, Zhang, Yang, Wang, Yisen

arXiv.org Artificial Intelligence

As advancements in large language models (LLMs) continue and the demand for personalized models increases, parameter-efficient fine-tuning (PEFT) methods (e.g., LoRA) will become essential due to their efficiency in reducing computation costs. However, recent studies have raised alarming concerns that LoRA fine-tuning could potentially compromise the safety alignment of LLMs, posing significant risks for the model owner. In this paper, we first investigate the underlying mechanism by analyzing changes in safety-alignment-related features before and after fine-tuning. We then propose a fixed safety module computed from safety data and a task-specific initialization for the trainable parameters of the low-rank adaptation, termed Safety-alignment preserved Low-Rank Adaptation (SaLoRA). Unlike previous LoRA methods and their variants, SaLoRA enables targeted modifications to LLMs without disrupting their original alignment. Our experiments show that SaLoRA outperforms various adapter-based approaches across multiple evaluation metrics and fine-tuning tasks.
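
A hedged sketch of the core mechanism, assuming the fixed safety module acts as a projection applied to the LoRA update so that the update cannot perturb a safety-relevant subspace. The random basis and dimensions below stand in for quantities that SaLoRA would compute from safety data; this is not the exact SaLoRA recipe.

```python
import torch
import torch.nn as nn

class SafetyPreservedLoRALinear(nn.Module):
    """LoRA layer whose update is projected off a fixed 'safety' subspace."""
    def __init__(self, base: nn.Linear, r=8, safety_basis=None, alpha=16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False                       # frozen pretrained weight
        d_out, d_in = base.out_features, base.in_features
        self.A = nn.Parameter(torch.randn(r, d_in) * 0.01)
        self.B = nn.Parameter(torch.zeros(d_out, r))
        self.scale = alpha / r
        if safety_basis is None:                          # (d_out, k) orthonormal columns
            safety_basis = torch.linalg.qr(torch.randn(d_out, 4)).Q
        # Fixed projector that removes components lying in the safety subspace.
        self.register_buffer("C", torch.eye(d_out) - safety_basis @ safety_basis.T)

    def forward(self, x):
        delta = (x @ self.A.T) @ self.B.T                 # standard LoRA update
        return self.base(x) + self.scale * (delta @ self.C.T)

layer = SafetyPreservedLoRALinear(nn.Linear(64, 64))
out = layer(torch.randn(2, 64))
```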


Safe Planner: Empowering Safety Awareness in Large Pre-Trained Models for Robot Task Planning

Li, Siyuan, Ma, Zhe, Liu, Feifan, Lu, Jiani, Xiao, Qinqin, Sun, Kewu, Cui, Lingfei, Yang, Xirui, Liu, Peng, Wang, Xun

arXiv.org Artificial Intelligence

Robot task planning is an important problem for autonomous robots facing challenging long-horizon tasks. As large pre-trained models have demonstrated superior planning ability, recent research investigates using large models to achieve autonomous planning for robots across diverse tasks. However, because large models are pre-trained on Internet data and lack knowledge of real task scenes, large models used as planners may make unsafe decisions that harm the robots and their surrounding environments. To address this challenge, we propose a novel Safe Planner framework, which empowers safety awareness in large pre-trained models to accomplish safe and executable planning. In this framework, we develop a safety prediction module to guide the high-level large-model planner; this safety module, trained in a simulator, can be effectively transferred to real-world tasks. The proposed Safe Planner framework is evaluated in both simulated environments and on real robots. The experimental results demonstrate that Safe Planner not only achieves state-of-the-art task success rates but also substantially improves safety during task execution. Experiment videos are available at https://sites.google.com/view/safeplanner .
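
One way to picture the interaction between the planner and the safety prediction module is as a filter over candidate plan steps. The function names, fallback action, threshold, and the keyword-based stand-in for the learned safety predictor below are assumptions made for this sketch.

```python
from typing import Callable, List

def safe_plan_step(candidates: List[str],
                   safety_score: Callable[[str, str], float],
                   scene_description: str,
                   threshold: float = 0.5) -> str:
    """Pick the first candidate the safety module deems safe.

    `candidates` would come from the large pre-trained planner; `safety_score`
    stands in for the simulator-trained safety prediction module.
    """
    for action in candidates:                     # candidates are planner-ranked
        if safety_score(action, scene_description) >= threshold:
            return action
    return "ask_human_for_help"                   # fall back when nothing is safe

# Toy usage with a keyword-based stand-in for the learned safety predictor.
def toy_safety_score(action: str, scene: str) -> float:
    return 0.1 if "knife" in action and "human nearby" in scene else 0.9

plan = safe_plan_step(["pick up knife", "pick up sponge"],
                      toy_safety_score, "kitchen, human nearby")
print(plan)   # -> "pick up sponge"
```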


Safety Alignment for Vision Language Models

Liu, Zhendong, Nie, Yuanbi, Tan, Yingshui, Yue, Xiangyu, Cui, Qiushi, Wang, Chongjun, Zhu, Xiaoyong, Zheng, Bo

arXiv.org Artificial Intelligence

Benefiting from the powerful capabilities of Large Language Models (LLMs), pre-trained visual encoder models connected to an LLM can realize Vision Language Models (VLMs). However, existing research shows that the visual modality of VLMs is vulnerable, with attackers easily bypassing the LLM's safety alignment through visual-modality features to launch attacks. To address this issue, we enhance the visual-modality safety alignment of existing VLMs by adding safety modules, including a safety projector, safety tokens, and a safety head, through a two-stage training process, effectively improving the model's defense against risky images. For example, building upon the LLaVA-v1.5 model, we achieve a safety score of 8.26, surpassing GPT-4V on the Red Teaming Visual Language Models (RTVLM) benchmark. Our method boasts ease of use, high flexibility, and strong controllability, and it enhances safety while having minimal impact on the model's general performance. Moreover, our alignment strategy also uncovers some potentially risky content within commonly used open-source multimodal datasets. Our code will be open-sourced after the anonymous review.
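
The three named additions can be sketched as small PyTorch modules: a safety projector over visual features, learnable safety tokens prepended to the sequence, and a safety head that predicts risk. The dimensions and wiring below are assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class VisualSafetyModules(nn.Module):
    """Sketch of the three added pieces: safety projector, safety tokens, safety head."""
    def __init__(self, vis_dim=1024, llm_dim=4096, n_safety_tokens=4, n_risk_classes=2):
        super().__init__()
        self.safety_projector = nn.Linear(vis_dim, llm_dim)
        self.safety_tokens = nn.Parameter(torch.zeros(n_safety_tokens, llm_dim))
        self.safety_head = nn.Linear(llm_dim, n_risk_classes)

    def forward(self, vis_feats):                         # vis_feats: (B, N, vis_dim)
        safe_feats = self.safety_projector(vis_feats)     # project into the LLM space
        tokens = self.safety_tokens.expand(vis_feats.size(0), -1, -1)
        seq = torch.cat([tokens, safe_feats], dim=1)      # prepend safety tokens
        risk_logits = self.safety_head(seq.mean(dim=1))   # pooled risk prediction
        return seq, risk_logits

mods = VisualSafetyModules()
seq, risk = mods(torch.randn(2, 16, 1024))
```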


SafeLight: A Reinforcement Learning Method toward Collision-free Traffic Signal Control

Du, Wenlu, Ye, Junyi, Gu, Jingyi, Li, Jing, Wei, Hua, Wang, Guiling

arXiv.org Artificial Intelligence

Traffic signal control is safety-critical for our daily life. Roughly one-quarter of road accidents in the U.S. happen at intersections due to problematic signal timing, urging the development of safety-oriented intersection control. However, existing studies on adaptive traffic signal control using reinforcement learning have focused mainly on minimizing traffic delay while neglecting potential exposure to unsafe conditions. We, for the first time, incorporate road safety standards as enforced constraints to ensure the safety of existing reinforcement learning methods, aiming toward operating intersections with zero collisions. We propose a safety-enhanced residual reinforcement learning method (SafeLight) and employ multiple optimization techniques, such as a multi-objective loss function and reward shaping, for better knowledge integration. Extensive experiments are conducted using both synthetic and real-world benchmark datasets. Results show that our method can significantly reduce collisions while increasing traffic mobility.
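
A compact sketch of the residual-RL and reward-shaping ideas, assuming a rule-based controller supplies baseline action values, a learned residual corrects them, and safety standards are encoded as a hard mask plus a penalty term. The weights, mask, and near-collision signal are illustrative assumptions.

```python
import numpy as np

def shaped_reward(delay, near_collisions, w_safety=10.0):
    """Reward shaping: trade off mobility against a safety penalty."""
    return -delay - w_safety * near_collisions

def residual_action(q_base, q_residual, safe_mask):
    """Residual RL: combine a rule-based baseline's action values with a learned
    residual, and mask out signal phases flagged as unsafe."""
    q = q_base + q_residual
    q = np.where(safe_mask, q, -np.inf)        # never pick an unsafe phase
    return int(np.argmax(q))

q_base = np.array([1.2, 0.4, 0.9, 0.1])        # e.g. values from a fixed-time controller
q_res  = np.array([-0.1, 0.8, 0.0, 0.0])       # learned correction
mask   = np.array([True, True, False, True])   # phase 2 violates a safety standard
print(residual_action(q_base, q_res, mask), shaped_reward(delay=12.0, near_collisions=1))
```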


Safety Enhancement for Deep Reinforcement Learning in Autonomous Separation Assurance

Guo, Wei, Brittain, Marc, Wei, Peng

arXiv.org Artificial Intelligence

The separation assurance task will be extremely challenging for air traffic controllers in complex, high-density airspace environments. Deep reinforcement learning (DRL) was used to develop an autonomous separation assurance framework in our previous work, where the learned model advised speed maneuvers. To improve the safety of this model in unseen environments with uncertainties, in this work we propose a safety module for DRL in autonomous separation assurance applications. The proposed module directly addresses both model uncertainty and state uncertainty to improve safety. Our safety module consists of two sub-modules: (1) the state safety sub-module is based on an execution-time data augmentation method that introduces state disturbances into the model's input state; (2) the model safety sub-module is a Monte-Carlo dropout extension that learns the posterior distribution of the DRL model policy. We demonstrate the effectiveness of the two sub-modules in an open-source air traffic simulator with challenging environment settings. Through extensive numerical experiments, our results show that the proposed safety sub-modules help the DRL agent significantly improve its safety performance in an autonomous separation assurance task.
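
The two sub-modules can be sketched together as an execution-time action-selection routine: perturb the input state (state safety) and sample the policy with dropout kept active (model safety), then pick the action that is robustly good across samples. The aggregation rule (mean minus standard deviation) and network shape below are assumptions made for this sketch.

```python
import torch
import torch.nn as nn

def cautious_action(policy: nn.Module, state: torch.Tensor,
                    n_noise=8, n_dropout=16, sigma=0.05):
    """Select an action that remains good under state and model uncertainty."""
    policy.train()                                        # keep dropout layers active (MC dropout)
    noisy = state.unsqueeze(0) + sigma * torch.randn(n_noise, *state.shape)
    with torch.no_grad():
        samples = torch.stack([policy(noisy) for _ in range(n_dropout)])  # (D, N, A)
    q_mean = samples.mean(dim=(0, 1))
    q_std = samples.std(dim=(0, 1))
    return int(torch.argmax(q_mean - q_std))              # prefer robustly good actions

policy = nn.Sequential(nn.Linear(6, 64), nn.ReLU(), nn.Dropout(0.1), nn.Linear(64, 3))
print(cautious_action(policy, torch.randn(6)))
```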


Toward Deep Reinforcement Learning Without a Simulator: An Autonomous Steering Example

Hilleli, Bar (Technion) | El-Yaniv, Ran (Technion)

AAAI Conferences

We propose a scheme for training a computerized agent to perform complex human tasks such as highway steering. The scheme is designed to follow a natural learning process whereby a human instructor teaches a computerized trainee. It enables leveraging the weak supervision abilities of a (human) instructor who, while unable to perform the required task well herself, can provide coherent and learnable instantaneous reward signals to the computerized trainee. The learning process consists of three supervised elements followed by reinforcement learning. The supervised learning stages are: (i) supervised imitation learning; (ii) supervised reward induction; and (iii) supervised safety module construction. We implemented this scheme using deep convolutional networks and applied it to successfully create a computerized agent capable of autonomous highway steering in the well-known racing game Assetto Corsa. We demonstrate that the use of all components is essential to effectively carry out reinforcement learning of the steering task using vision alone, without access to the driving simulator's internals, and operating in wall-clock time.
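
The reward-induction stage can be sketched as a small supervised regression problem: a network learns to reproduce the instructor's instantaneous reward signal from frames and then replaces the missing simulator reward during reinforcement learning. The architecture and labels below are assumptions made for this sketch.

```python
import torch
import torch.nn as nn

# Regress the instructor's instantaneous reward signal from raw frames,
# then reuse the trained net as the reward during the RL stage.
reward_net = nn.Sequential(
    nn.Conv2d(3, 16, 5, stride=2), nn.ReLU(),
    nn.Conv2d(16, 32, 5, stride=2), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, 1), nn.Tanh(),                 # instructor scores in [-1, 1]
)

frames = torch.randn(8, 3, 84, 84)               # frames shown to the instructor
instructor_scores = torch.rand(8, 1) * 2 - 1     # her instantaneous reward labels
opt = torch.optim.Adam(reward_net.parameters(), lr=1e-3)
loss = nn.functional.mse_loss(reward_net(frames), instructor_scores)
loss.backward()
opt.step()

# During reinforcement learning, reward_net(frame) stands in for the simulator reward.
```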


Deep Learning of Robotic Tasks without a Simulator using Strong and Weak Human Supervision

Hilleli, Bar, El-Yaniv, Ran

arXiv.org Artificial Intelligence

We propose a scheme for training a computerized agent to perform complex human tasks such as highway steering. The scheme is designed to follow a natural learning process whereby a human instructor teaches a computerized trainee. The learning process consists of five elements: (i) unsupervised feature learning; (ii) supervised imitation learning; (iii) supervised reward induction; (iv) supervised safety module construction; and (v) reinforcement learning. We implemented the last four elements of the scheme using deep convolutional networks and applied them to successfully create a computerized agent capable of autonomous highway steering in the well-known racing game Assetto Corsa. We demonstrate that the use of the last four elements is essential to effectively carry out the steering task using vision alone, without access to the driving simulator's internals, and operating in wall-clock time. This is also made possible through the introduction of a safety network, a novel way of preventing the agent from performing catastrophic mistakes during the reinforcement learning stage.
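
The safety network can be sketched as a supervised classifier that estimates the probability that a proposed action leads to a catastrophic outcome and vetoes it in favor of a fallback during the RL stage. The threshold, fallback action, and feature dimensions below are assumptions made for this sketch.

```python
import torch
import torch.nn as nn

class SafetyNetwork(nn.Module):
    """Predicts the probability that taking `action` from `feat` is catastrophic."""
    def __init__(self, feat_dim=128, n_actions=3):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(feat_dim + n_actions, 64), nn.ReLU(),
                                 nn.Linear(64, 1), nn.Sigmoid())

    def forward(self, feat, action_onehot):
        return self.net(torch.cat([feat, action_onehot], dim=-1))

def gated_action(feat, proposed, safety_net, n_actions=3, fallback=1, threshold=0.3):
    """Veto the proposed action if the safety network flags it as too risky."""
    onehot = torch.zeros(1, n_actions)
    onehot[0, proposed] = 1.0
    p_crash = safety_net(feat, onehot).item()
    return proposed if p_crash < threshold else fallback   # e.g. "keep lane" fallback

safety_net = SafetyNetwork()
print(gated_action(torch.randn(1, 128), proposed=0, safety_net=safety_net))
```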